Haoli Bai

Merging Beyond: Streaming LLM Updates via Activation-Guided Rotations

Feb 03, 2026
Via arXiv

OVD: On-policy Verbal Distillation

Jan 29, 2026
Via arXiv

From Verifiable Dot to Reward Chain: Harnessing Verifiable Reference-based Rewards for Reinforcement Learning of Open-ended Generation

Jan 26, 2026
Via arXiv

What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study

Jan 21, 2026
Via arXiv

Benchmarking Post-Training Quantization of Large Language Models under Microscaling Floating Point Formats

Jan 14, 2026
Via arXiv

SWE-Lego: Pushing the Limits of Supervised Fine-tuning for Software Issue Resolving

Jan 07, 2026
Via arXiv

Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone

Dec 27, 2025
Via arXiv

Memory-T1: Reinforcement Learning for Temporal Reasoning in Multi-session Agents

Dec 23, 2025
Via arXiv

InSight-o3: Empowering Multimodal Foundation Models with Generalized Visual Search

Dec 21, 2025
Via arXiv

A1: Asynchronous Test-Time Scaling via Conformal Prediction

Sep 18, 2025
Via arXiv